Generating 'interrupted' cells from bulk RNA-seq to interpolate scRNA-seq¶
Because single-cell sequencing captures only a limited number of cells, studies of cell development and differentiation trajectories often contain 'interruptions' (gaps). Bulk RNA-seq of whole tissue, in contrast, contains these 'interrupted' cells in principle. To our knowledge, no existing algorithm extracts 'interrupted' cells from bulk RNA-seq, and tools that effectively bridge bulk and single-cell analyses are lacking.
We developed BulkTrajBlend in OmicVerse to address cell continuity in single-cell sequencing. BulkTrajBlend first deconvolves bulk RNA-seq into single-cell data and then uses a GNN-based overlapping community discovery algorithm to identify contiguous cells in the generated single-cell data.
Colab_Reproducibility: https://colab.research.google.com/drive/1HulVXQIlUEcpGRDZo4MxcHYOjnVhuCC-?usp=sharing
import omicverse as ov
from omicverse.utils import mde
import scanpy as sc
import scvelo as scv
ov.utils.ov_plot_set()
loading data¶
For illustration, we apply BulkTrajBlend to dentate gyrus neurogenesis, which comprises multiple heterogeneous subpopulations.
We use single-cell RNA-seq data (GEO accession: GSE95753) obtained from the dentate gyrus of the hippocampus in mice, along with bulk RNA-seq data (GEO accession: GSE74985).
adata=scv.datasets.dentategyrus()
adata
AnnData object with n_obs × n_vars = 2930 × 13913
obs: 'clusters', 'age(days)', 'clusters_enlarged'
uns: 'clusters_colors'
obsm: 'X_umap'
layers: 'ambiguous', 'spliced', 'unspliced'
import numpy as np
bulk=ov.utils.read('GSE74985_mergedCount.txt.gz',index_col=0)
bulk=ov.bulk.Matrix_ID_mapping(bulk,'pair_GRCm39.tsv')
bulk.head()
| | dg_d_1 | dg_d_2 | dg_d_3 | dg_v_1 | dg_v_2 | dg_v_3 | ca4_1 | ca4_2 | ca4_3 | ca3_d_1 | ... | ca3_v_3 | ca2_1 | ca2_2 | ca2_3 | ca1_d_1 | ca1_d_2 | ca1_d_3 | ca1_v_1 | ca1_v_2 | ca1_v_3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Fnip2 | 784 | 301 | 339 | 659 | 924 | 988 | 494 | 269 | 394 | 309 | ... | 709 | 467 | 391 | 558 | 343 | 634 | 395 | 112 | 200 | 240 |
| Gm22713 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Zfp595 | 0 | 72 | 10 | 20 | 104 | 101 | 66 | 30 | 131 | 80 | ... | 144 | 73 | 64 | 81 | 92 | 60 | 45 | 6 | 1 | 10 |
| Treh | 0 | 0 | 0 | 6 | 0 | 0 | 1 | 2 | 0 | 0 | ... | 0 | 1 | 2 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
| Cat | 1299 | 539 | 492 | 489 | 502 | 779 | 1370 | 882 | 1231 | 1595 | ... | 1594 | 805 | 694 | 917 | 1506 | 1325 | 1053 | 299 | 257 | 372 |
5 rows × 24 columns
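Before configuring the model, it can help to check how many genes the two datasets share, since only genes present in both can be compared. The following is a minimal sketch on toy data: `bulk_toy` copies a few values from the table above, while `single_cell_genes` (including the gene `Sox2`) is a hypothetical stand-in for `adata.var_names`.

```python
import pandas as pd

# Toy stand-in for the mapped bulk matrix (genes x samples).
bulk_toy = pd.DataFrame(
    {"dg_d_1": [784, 0, 1299], "dg_d_2": [301, 0, 539]},
    index=["Fnip2", "Gm22713", "Cat"],
)

# Hypothetical stand-in for adata.var_names.
single_cell_genes = pd.Index(["Cat", "Fnip2", "Sox2"])

# Genes present in both datasets.
shared = bulk_toy.index.intersection(single_cell_genes)
print(f"{len(shared)} of {bulk_toy.shape[0]} bulk genes are shared")
```

With the real objects, the same check would read `bulk.index.intersection(adata.var_names)`.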
Configure the BulkTrajBlend model¶
Here, we pass the bulk RNA-seq and scRNA-seq data prepared above to the BulkTrajBlend model and preprocess them with the lazy functions. The dg_d samples represent the neuronal data of the dentate gyrus; because they are three replicates, we merge them.
Note that the bulk RNA-seq and scRNA-seq data used here are raw counts, neither normalised nor log-transformed. The lazy functions are not suitable if your data have already been processed. Note also that the single-cell data must not be scaled.
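A quick way to check the raw-count requirement is to test whether a matrix contains only non-negative whole numbers. The helper below is a hypothetical convenience function, not part of omicverse, and the heuristic can be fooled (for example by rounded normalised values), so treat it as a sanity check only.

```python
import numpy as np

def looks_like_raw_counts(matrix):
    """Heuristic: raw counts are non-negative whole numbers."""
    values = np.asarray(matrix, dtype=float)
    return bool(np.all(values >= 0) and np.allclose(values, np.round(values)))

raw_example = np.array([[784, 301], [0, 0], [1299, 539]])
processed_example = np.log1p(raw_example)

print(looks_like_raw_counts(raw_example))        # raw counts pass the check
print(looks_like_raw_counts(processed_example))  # log-transformed values fail
```

For a sparse `adata.X`, one would pass `adata.X.data` (the stored non-zero values) instead of the full matrix.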
bulktb=ov.bulk2single.BulkTrajBlend(bulk_seq=bulk,single_seq=adata,
celltype_key='clusters',)
bulktb.bulk_preprocess_lazy(group=['dg_d_1','dg_d_2','dg_d_3'])
bulktb.single_preprocess_lazy()
......drop duplicates index in bulk data
......deseq2 normalize the bulk data
......log10 the bulk data
......calculate the mean of each group
......normalize the single data
normalizing counts per cell
finished (0:00:00)
......log1p the single data
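The bulk log above lists three steps: DESeq2-style normalisation, a log10 transform, and averaging within the group. The sketch below illustrates those steps on toy data under stated assumptions: it uses the standard median-of-ratios size factors and a pseudocount of 1 before the log, neither of which is confirmed to match omicverse's internals. The function name and the toy matrix are hypothetical.

```python
import numpy as np
import pandas as pd

def median_of_ratios_normalize(counts):
    """Divide each sample by its DESeq2-style size factor."""
    # Use only genes with non-zero counts in every sample.
    positive = counts[(counts > 0).all(axis=1)]
    log_counts = np.log(positive)
    # Per-gene log geometric mean across samples.
    log_geo_mean = log_counts.mean(axis=1)
    # Size factor = median over genes of count / geometric mean.
    size_factors = np.exp(log_counts.sub(log_geo_mean, axis=0).median(axis=0))
    return counts / size_factors

toy = pd.DataFrame(
    {"rep1": [100, 200, 0, 50], "rep2": [200, 400, 10, 100], "rep3": [50, 100, 5, 25]},
    index=["GeneA", "GeneB", "GeneC", "GeneD"],
)

normalized = median_of_ratios_normalize(toy)
logged = np.log10(normalized + 1)   # pseudocount of 1 is an assumption
merged = logged.mean(axis=1)        # one merged profile for the replicate group
print(merged)
```

In the toy matrix the replicates differ only by sequencing depth for GeneA, GeneB and GeneD, so normalisation brings those genes to identical values across replicates.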
Training the beta-VAE model¶
We first generate single-cell data from the bulk RNA-seq data using a beta-VAE, then filter out noisy cells using Leiden cluster size as a constraint.
cell_target_num sets the expected number of cells in each cell type; here we do not use a least-squares approach to fit the cell proportions.
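For readers unfamiliar with the beta-VAE, its objective is the usual VAE loss with the KL term weighted by a factor beta. The NumPy sketch below shows that general form, assuming a squared-error reconstruction term and a diagonal Gaussian posterior. The function name, the default `beta=4.0` and the toy inputs are illustrative assumptions, not the settings used inside BulkTrajBlend.

```python
import numpy as np

def beta_vae_loss(x, x_recon, mu, log_var, beta=4.0):
    """Reconstruction error plus beta-weighted KL divergence to N(0, I)."""
    # Squared reconstruction error, summed over genes, averaged over cells.
    recon = np.mean(np.sum((x - x_recon) ** 2, axis=1))
    # Closed-form KL between N(mu, exp(log_var)) and the standard normal.
    kl = np.mean(-0.5 * np.sum(1 + log_var - mu**2 - np.exp(log_var), axis=1))
    return recon + beta * kl, recon, kl

rng = np.random.default_rng(0)
x = rng.random((8, 5))                 # 8 toy cells, 5 toy genes
mu = np.zeros((8, 2))                  # posterior mean equal to the prior
log_var = np.zeros((8, 2))             # posterior variance equal to the prior

total, recon, kl = beta_vae_loss(x, x, mu, log_var)
print(total, recon, kl)                # all zero: perfect reconstruction, posterior == prior
```

A larger beta pushes the latent code towards the prior at the cost of reconstruction accuracy.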
bulktb.vae_configure(cell_target_num=100)
'''
bulktb.vae_train(batch_size=256,
learning_rate=1e-4,
hidden_size=256,
epoch_num=3500,
vae_save_dir='data/dg_d/newdata/save_model',
vae_save_name='dgd1_vae',
generate_save_dir='data/dg_d/newdata/output',
generate_save_name='dgd')
'''
bulktb.vae_load('data/dg_d/newdata/save_model/dgd_vae.pth')
...loading data
ranking genes
finished: added to `.uns['rank_genes_groups']`
'names', sorted np.recarray to be indexed by group ids
'scores', sorted np.recarray to be indexed by group ids
'logfoldchanges', sorted np.recarray to be indexed by group ids
'pvals', sorted np.recarray to be indexed by group ids
'pvals_adj', sorted np.recarray to be indexed by group ids (0:00:02)
loading model from data/dg_d/newdata/save_model/dgd_vae.pth
generate_adata=bulktb.vae_generate(leiden_size=25)
...generating
generating: 100%|██████████████████| 1400/1400 [00:00<00:00, 3263.67it/s]
generated done! extracting highly variable genes
finished (0:00:00)
--> added
'highly_variable', boolean vector (adata.var)
'means', float vector (adata.var)
'dispersions', float vector (adata.var)
'dispersions_norm', float vector (adata.var)
computing PCA
Note that scikit-learn's randomized PCA might not be exactly reproducible across different computational platforms. For exact reproducibility, choose `svd_solver='arpack'.`
on highly variable genes
with n_comps=100
finished (0:00:00)
computing neighbors
finished: added to `.uns['neighbors']`
`.obsp['distances']`, distances for each pair of neighbors
`.obsp['connectivities']`, weighted adjacency matrix (0:00:01)
running Leiden clustering
finished: found 32 clusters and added
'leiden', the cluster labels (adata.obs, categorical) (0:00:00)
The filter leiden is ['18', '17', '16', '15', '14', '19', '20', '21', '22', '23', '24', '25', '26', '27', '28', '29', '30', '31']
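The last log line lists the Leiden clusters that were filtered out. A minimal sketch of that kind of size filter is shown below, assuming `leiden_size` acts as a minimum cluster size and that the comparison is strict; both are assumptions, and `small_clusters` is a hypothetical helper rather than an omicverse function.

```python
import pandas as pd

def small_clusters(labels, min_size):
    """Return the cluster labels that have fewer than min_size cells."""
    sizes = labels.value_counts()
    return sizes[sizes < min_size].index.tolist()

# Toy Leiden labels: clusters "0" and "1" are large, "2" and "3" are small.
labels = pd.Series(["0"] * 30 + ["1"] * 26 + ["2"] * 10 + ["3"] * 3)

to_drop = small_clusters(labels, min_size=25)
kept = labels[~labels.isin(to_drop)]
print("filtered clusters:", to_drop)
print("cells kept:", len(kept))
```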
ov.bulk2single.bulk2single_plot_cellprop(generate_adata,celltype_key='clusters',
)
Visualize the generated scRNA-seq¶
To visualize the learned embeddings of the generated scRNA-seq data, we use the pymde wrapper in omicverse. MDE is a GPU-accelerated alternative to UMAP.
import scanpy as sc
from omicverse.utils import mde
generate_adata.obsm["X_mde"] = mde(generate_adata.obsm["X_pca"])
sc.pl.embedding(generate_adata,basis='X_mde',color=['clusters'],wspace=0.4,
palette=ov.utils.pyomic_palette())
Training the GNN model¶
Next, we use a GNN to look for overlapping communities (community = cell type) in the generated single-cell data. The model can be configured with the following parameters:
- gpu: The GPU ID for training the GNN model. Default is 0.
- hidden_size: The hidden size for the GNN model. Default is 128.
- weight_decay: The weight decay for the GNN model. Default is 1e-2.
- dropout: The dropout for the GNN model. Default is 0.5.
- batch_norm: Whether to use batch normalization for the GNN model. Default is True.
- lr: The learning rate for the GNN model. Default is 1e-3.
- max_epochs: The maximum epoch number for training the GNN model. Default is 500.
- display_step: The display step for training the GNN model. Default is 25.
- balance_loss: Whether to use the balance loss for training the GNN model. Default is True.
- stochastic_loss: Whether to use the stochastic loss for training the GNN model. Default is True.
- batch_size: The batch size for training the GNN model. Default is 2000.
- num_workers: The number of workers for training the GNN model. Default is 5.
bulktb.gnn_configure(max_epochs=2000)
torch have been install version: 1.13.0+cu117
Many parameters can be controlled during training; here we leave them all at their defaults:
- thresh: the threshold used to filter the overlapping communities
- gnn_save_dir: the directory in which to save the GNN model
- gnn_save_name: the file name under which to save the GNN model
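To make the role of a threshold such as `thresh` concrete, the sketch below converts soft community affiliation scores into binary memberships. It assumes the GNN yields one non-negative score per cell and community and that scores above the threshold count as membership. The score matrix and the value 0.5 are made up for illustration and are not taken from omicverse.

```python
import numpy as np

# Hypothetical affiliation scores: 3 cells x 3 communities.
affiliation = np.array([
    [0.9, 0.7, 0.0],
    [0.1, 0.0, 0.8],
    [0.2, 0.3, 0.1],
])
thresh = 0.5  # illustrative value, not the library default

# A cell belongs to every community whose score exceeds the threshold.
membership = (affiliation > thresh).astype(int)
n_communities = membership.sum(axis=1)

print(membership)
print(n_communities)  # cells with more than one community are "overlapping"
```

Under this reading, a lower threshold produces more overlapping cells and a higher one produces fewer.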
bulktb.gnn_train()
Epoch 675, loss.full = 0.1612, nmi = 0.68: 0%| | 0/2000 [00:16<?, ?it/
Breaking due to early stopping at epoch 675
Final nmi = 0.675
......add nocd result to adata.obs
...save trained gnn in save_model/gnn.pth.
Because the neighbourhood graph of the generated single-cell data is built with some randomness, a saved GNN model can only be loaded on the same (fixed) generated single-cell data it was trained on. Otherwise an error is raised.
#bulktb.gnn_load('save_model/gnn.pth')
We can use GNN to get an overlapping community for each cell.
res_pd=bulktb.gnn_generate()
res_pd.head()
| | nocd_Mossy | nocd_Granule immature | nocd_Microglia | nocd_Radial Glia-like | nocd_GABA | nocd_Granule mature | nocd_OL | nocd_Endothelial | nocd_Cck-Tox | nocd_Neuroblast | nocd_Cck-Tox_3 | nocd_Cajal Retzius | nocd_Astrocytes | nocd_OPC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| C_1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| C_2 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| C_3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 |
| C_4 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| C_5 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
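Each row of this table is a binary membership vector, so cells that belong to more than one community can be found by summing across columns. The sketch below rebuilds the five rows shown above (only the columns with a non-zero entry are copied) and lists the overlapping cells. With the real result, `res_head` would simply be `res_pd`.

```python
import pandas as pd

# The five rows displayed above, restricted to columns with a non-zero entry.
res_head = pd.DataFrame(
    {
        "nocd_Mossy": [1, 0, 0, 0, 0],
        "nocd_Granule immature": [1, 0, 0, 0, 0],
        "nocd_Radial Glia-like": [0, 0, 0, 0, 1],
        "nocd_Granule mature": [0, 0, 0, 1, 0],
        "nocd_OL": [0, 1, 0, 0, 0],
        "nocd_Cck-Tox": [1, 0, 0, 0, 0],
        "nocd_Neuroblast": [0, 0, 1, 0, 0],
        "nocd_Cajal Retzius": [0, 0, 1, 0, 0],
    },
    index=["C_1", "C_2", "C_3", "C_4", "C_5"],
)

# Number of communities per cell; more than one means the cell overlaps.
n_comm = res_head.sum(axis=1)
overlapping = n_comm[n_comm > 1].index.tolist()

# Community names per overlapping cell, with the "nocd_" prefix stripped.
members = {
    cell: [col[len("nocd_"):] for col in res_head.columns[(res_head.loc[cell] == 1).to_numpy()]]
    for cell in overlapping
}
print(members)
```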
bulktb.nocd_obj.adata.obsm["X_mde"] = mde(bulktb.nocd_obj.adata.obsm["X_pca"])
sc.pl.embedding(bulktb.nocd_obj.adata,basis='X_mde',color=['clusters','nocd_n'],wspace=0.4,
palette=ov.utils.pyomic_palette())
sc.pl.embedding(bulktb.nocd_obj.adata[~bulktb.nocd_obj.adata.obs['nocd_n'].str.contains('-')],
basis='X_mde',
color=['clusters','nocd_n'],
wspace=0.4,palette=sc.pl.palettes.default_102)
Interpolation of the "interrupted" cells¶
A simple function is provided to interpolate the "interrupted" cells into the original data, making the single-cell data continuous.
print('raw cells: ',bulktb.single_seq.shape[0])
#adata1=bulktb.interpolation('Neuroblast')
adata1=bulktb.interpolation('OPC')
print('interpolation cells: ',adata1.shape[0])
raw cells: 2930
interpolation cells: 3061
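Conceptually, interpolation enlarges the original dataset with generated cells related to the chosen cell type (here 131 cells were added, 3061 minus 2930). One plausible reading, shown below on toy tables, is that generated cells whose communities include the target are appended to the raw cells. This is an illustrative assumption about the idea, not the implementation of `bulktb.interpolation`, and every name and value in the sketch is hypothetical.

```python
import pandas as pd

# Toy expression tables (cells x genes).
raw = pd.DataFrame(
    {"GeneA": [1.0, 0.0, 2.0], "GeneB": [0.5, 1.5, 0.0]},
    index=["raw_1", "raw_2", "raw_3"],
)
generated = pd.DataFrame(
    {"GeneA": [0.8, 1.2, 0.1], "GeneB": [0.4, 0.9, 1.1]},
    index=["C_1", "C_2", "C_3"],
)

# Toy binary community membership for the generated cells.
membership = pd.DataFrame(
    {"nocd_OPC": [1, 0, 1], "nocd_OL": [1, 1, 0]},
    index=generated.index,
)

target = "OPC"
# Keep generated cells assigned to the target community, then append them.
selected = generated[membership[f"nocd_{target}"] == 1]
combined = pd.concat([raw, selected])

print("raw cells:", raw.shape[0])
print("interpolation cells:", combined.shape[0])
```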
Visualisation of single cell data before and after interpolation¶
adata1.raw = adata1
sc.pp.highly_variable_genes(adata1, min_mean=0.0125, max_mean=3, min_disp=0.5)
adata1 = adata1[:, adata1.var.highly_variable]
sc.pp.scale(adata1, max_value=10)
extracting highly variable genes
finished (0:00:00)
--> added
'highly_variable', boolean vector (adata.var)
'means', float vector (adata.var)
'dispersions', float vector (adata.var)
'dispersions_norm', float vector (adata.var)
... as `zero_center=True`, sparse input is densified and may lead to large memory consumption
sc.tl.pca(adata1, n_comps=100, svd_solver="auto")
computing PCA
Note that scikit-learn's randomized PCA might not be exactly reproducible across different computational platforms. For exact reproducibility, choose `svd_solver='arpack'.`
on highly variable genes
with n_comps=100
finished (0:00:02)
sc.pp.normalize_total(adata, target_sum=1e4)
sc.pp.log1p(adata)
adata.raw = adata
sc.pp.highly_variable_genes(adata, min_mean=0.0125, max_mean=3, min_disp=0.5)
adata = adata[:, adata.var.highly_variable]
sc.pp.scale(adata, max_value=10)
normalizing counts per cell
finished (0:00:00)
extracting highly variable genes
finished (0:00:00)
--> added
'highly_variable', boolean vector (adata.var)
'means', float vector (adata.var)
'dispersions', float vector (adata.var)
'dispersions_norm', float vector (adata.var)
... as `zero_center=True`, sparse input is densified and may lead to large memory consumption
sc.tl.pca(adata, n_comps=100, svd_solver="auto")
computing PCA
Note that scikit-learn's randomized PCA might not be exactly reproducible across different computational platforms. For exact reproducibility, choose `svd_solver='arpack'.`
on highly variable genes
with n_comps=100
finished (0:00:01)
adata.obsm["X_mde"] = mde(adata.obsm["X_pca"])
adata1.obsm["X_mde"] = mde(adata1.obsm["X_pca"])
Visualisation of the pseudotime trajectory of cells before and after interpolation¶
Here, we use pyVIA to compute the pseudotime.
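The VIA log reports "Computing lazy-teleporting expected hitting times", so pseudotime is tied to how long a random walk started at the root takes to reach each cluster. The sketch below computes expected hitting times for a plain (non-lazy, non-teleporting) random walk on a four-node path graph. It is a simplified illustration of the concept, not VIA's algorithm, and the function name is hypothetical.

```python
import numpy as np

def hitting_times_from_root(adjacency, root):
    """Expected steps for a simple random walk from root to first reach each node."""
    A = np.asarray(adjacency, dtype=float)
    P = A / A.sum(axis=1, keepdims=True)   # row-stochastic transition matrix
    n = P.shape[0]
    times = np.zeros(n)
    for target in range(n):
        if target == root:
            continue
        # Remove the target node and solve (I - Q) h = 1 for the remaining nodes.
        keep = [i for i in range(n) if i != target]
        Q = P[np.ix_(keep, keep)]
        h = np.linalg.solve(np.eye(n - 1) - Q, np.ones(n - 1))
        times[target] = h[keep.index(root)]
    return times

# Path graph 0 - 1 - 2 - 3 with node 0 as the root.
path = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
pseudotime = hitting_times_from_root(path, root=0)
print(pseudotime)  # nodes farther from the root get larger values
```

On this path graph the hitting time grows with the square of the distance from the root, which gives a monotone ordering of nodes along the trajectory.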
v0 = ov.single.pyVIA(adata=adata,adata_key='X_pca',adata_ncomps=100, basis='X_mde',
clusters='clusters',knn=20,random_seed=4,root_user=['nIPC'],
dataset='group')
v0.run()
2023-05-27 17:30:57.081469 Running VIA over input data of 2930 (samples) x 100 (features)
2023-05-27 17:30:57.081518 Knngraph has 20 neighbors
2023-05-27 17:30:58.095812 Finished global pruning of 20-knn graph used for clustering at level of 0.15. Kept 36.5 % of edges.
2023-05-27 17:30:58.104748 Number of connected components used for clustergraph is 1
2023-05-27 17:30:58.163854 Commencing community detection
2023-05-27 17:30:58.191759 Finished running Leiden algorithm. Found 527 clusters.
2023-05-27 17:30:58.192455 Merging 500 very small clusters (<10)
2023-05-27 17:30:58.196490 Finished detecting communities. Found 27 communities
2023-05-27 17:30:58.196631 Making cluster graph. Global cluster graph pruning level: 0.15
2023-05-27 17:30:58.201019 Graph has 1 connected components before pruning
2023-05-27 17:30:58.202458 Graph has 10 connected components after pruning
2023-05-27 17:30:58.207689 Graph has 1 connected components after reconnecting
2023-05-27 17:30:58.208068 0.0% links trimmed from local pruning relative to start
2023-05-27 17:30:58.208084 66.5% links trimmed from global pruning relative to start
2023-05-27 17:30:58.210067 Starting make edgebundle viagraph...
2023-05-27 17:30:58.210078 Make via clustergraph edgebundle
2023-05-27 17:30:59.875785 Hammer dims: Nodes shape: (27, 2) Edges shape: (106, 3)
2023-05-27 17:30:59.877662 component number 0 out of [0]
2023-05-27 17:30:59.889485 group root method
2023-05-27 17:30:59.889499 for component 0, the root is nIPC and ri nIPC
2023-05-27 17:30:59.895612 New root is 19 and majority nIPC
2023-05-27 17:30:59.896246 Computing lazy-teleporting expected hitting times
2023-05-27 17:31:00.607411 Identifying terminal clusters corresponding to unique lineages...
2023-05-27 17:31:00.607490 Closeness:[5, 7, 9, 11, 12, 13, 14, 15, 16, 18, 20]
2023-05-27 17:31:00.607501 Betweenness:[5, 7, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 22, 23, 24, 25]
2023-05-27 17:31:00.607507 Out Degree:[3, 5, 7, 9, 11, 12, 13, 14, 16, 18, 20, 22, 23, 24, 25]
remove the [0:2] just using to speed up testing
2023-05-27 17:31:00.607860 Terminal clusters corresponding to unique lineages in this component are [5, 7, 9, 11, 15, 16, 18, 20, 22, 23, 24, 25]
2023-05-27 17:31:00.958793 From root 19, the Terminal state 5 is reached 8 times.
2023-05-27 17:31:01.320944 From root 19, the Terminal state 7 is reached 61 times.
2023-05-27 17:31:01.691167 From root 19, the Terminal state 9 is reached 21 times.
2023-05-27 17:31:02.068806 From root 19, the Terminal state 11 is reached 5 times.
2023-05-27 17:31:02.445899 From root 19, the Terminal state 15 is reached 6 times.
2023-05-27 17:31:02.820311 From root 19, the Terminal state 16 is reached 5 times.
2023-05-27 17:31:03.129110 From root 19, the Terminal state 18 is reached 559 times.
2023-05-27 17:31:03.501901 From root 19, the Terminal state 20 is reached 5 times.
2023-05-27 17:31:03.868573 From root 19, the Terminal state 22 is reached 55 times.
2023-05-27 17:31:04.218262 From root 19, the Terminal state 23 is reached 564 times.
2023-05-27 17:31:04.540027 From root 19, the Terminal state 24 is reached 249 times.
2023-05-27 17:31:04.918572 From root 19, the Terminal state 25 is reached 5 times.
2023-05-27 17:31:04.949857 Terminal clusters corresponding to unique lineages are {5: 'Astrocytes', 7: 'Microglia', 9: 'Mossy', 11: 'Endothelial', 15: 'GABA', 16: 'Cajal Retzius', 18: 'Cck-Tox', 20: 'GABA', 22: 'Endothelial', 23: 'Granule mature', 24: 'Granule immature', 25: 'Endothelial'}
2023-05-27 17:31:04.949890 Begin projection of pseudotime and lineage likelihood
2023-05-27 17:31:05.451932 Graph has 1 connected components before pruning
2023-05-27 17:31:05.453837 Graph has 14 connected components after pruning
2023-05-27 17:31:05.461307 Graph has 1 connected components after reconnecting
2023-05-27 17:31:05.461699 56.6% links trimmed from local pruning relative to start
2023-05-27 17:31:05.461720 58.5% links trimmed from global pruning relative to start
2023-05-27 17:31:05.463370 Start making edgebundle milestone...
2023-05-27 17:31:05.463395 Start finding milestones
2023-05-27 17:31:06.076573 End milestones
2023-05-27 17:31:06.076735 Will use via-pseudotime for edges, otherwise consider providing a list of numeric labels (single cell level) or via_object
2023-05-27 17:31:06.082741 Recompute weights
2023-05-27 17:31:06.105145 pruning milestone graph based on recomputed weights
2023-05-27 17:31:06.106096 Graph has 1 connected components before pruning
2023-05-27 17:31:06.106678 Graph has 10 connected components after pruning
2023-05-27 17:31:06.113743 Graph has 1 connected components after reconnecting
2023-05-27 17:31:06.114462 66.8% links trimmed from global pruning relative to start
2023-05-27 17:31:06.114484 regenerate igraph on pruned edges
2023-05-27 17:31:06.120350 Setting numeric label as single cell pseudotime for coloring edges
2023-05-27 17:31:06.130257 Making smooth edges
2023-05-27 17:31:07.058497 Time elapsed 9.5 seconds
v1 = ov.single.pyVIA(adata=adata1,adata_key='X_pca',adata_ncomps=100, basis='X_mde',
clusters='clusters',knn=15,random_seed=4,root_user=['Neuroblast'],
#jac_std_global=0.01,
dataset='group')
v1.run()
2023-05-27 17:31:07.252008 Running VIA over input data of 3061 (samples) x 100 (features)
2023-05-27 17:31:07.252050 Knngraph has 15 neighbors
2023-05-27 17:31:08.305446 Finished global pruning of 15-knn graph used for clustering at level of 0.15. Kept 36.0 % of edges.
2023-05-27 17:31:08.314440 Number of connected components used for clustergraph is 1
2023-05-27 17:31:08.356977 Commencing community detection
2023-05-27 17:31:08.385129 Finished running Leiden algorithm. Found 516 clusters.
2023-05-27 17:31:08.385848 Merging 492 very small clusters (<10)
2023-05-27 17:31:08.389338 Finished detecting communities. Found 32 communities
2023-05-27 17:31:08.389502 Making cluster graph. Global cluster graph pruning level: 0.15
2023-05-27 17:31:08.393396 Graph has 1 connected components before pruning
2023-05-27 17:31:08.394926 Graph has 15 connected components after pruning
2023-05-27 17:31:08.402777 Graph has 1 connected components after reconnecting
2023-05-27 17:31:08.403163 0.0% links trimmed from local pruning relative to start
2023-05-27 17:31:08.403177 62.3% links trimmed from global pruning relative to start
2023-05-27 17:31:08.405583 Starting make edgebundle viagraph...
2023-05-27 17:31:08.405596 Make via clustergraph edgebundle
2023-05-27 17:31:08.522840 Hammer dims: Nodes shape: (32, 2) Edges shape: (122, 3)
2023-05-27 17:31:08.524690 component number 0 out of [0]
2023-05-27 17:31:08.537176 group root method
2023-05-27 17:31:08.537188 for component 0, the root is Neuroblast and ri Neuroblast
2023-05-27 17:31:08.542087 New root is 2 and majority Neuroblast
2023-05-27 17:31:08.542766 New root is 7 and majority Neuroblast
2023-05-27 17:31:08.543102 New root is 10 and majority Neuroblast
2023-05-27 17:31:08.545090 Computing lazy-teleporting expected hitting times
2023-05-27 17:31:09.347430 Identifying terminal clusters corresponding to unique lineages...
2023-05-27 17:31:09.347511 Closeness:[5, 8, 9, 12, 13, 14, 15, 16, 17, 18, 19, 20, 22]
2023-05-27 17:31:09.347524 Betweenness:[3, 5, 8, 9, 10, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 29, 31]
2023-05-27 17:31:09.347530 Out Degree:[2, 3, 5, 7, 8, 9, 10, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 29, 31]
remove the [0:2] just using to speed up testing
2023-05-27 17:31:09.348161 Terminal clusters corresponding to unique lineages in this component are [3, 5, 8, 12, 13, 14, 15, 16, 19, 23, 24, 25, 26, 29, 31]
2023-05-27 17:31:09.664370 From root 10, the Terminal state 3 is reached 620 times.
2023-05-27 17:31:10.078733 From root 10, the Terminal state 5 is reached 9 times.
2023-05-27 17:31:10.488539 From root 10, the Terminal state 8 is reached 6 times.
2023-05-27 17:31:10.895069 From root 10, the Terminal state 12 is reached 5 times.
2023-05-27 17:31:11.304508 From root 10, the Terminal state 13 is reached 5 times.
2023-05-27 17:31:11.719447 From root 10, the Terminal state 14 is reached 7 times.
2023-05-27 17:31:12.124411 From root 10, the Terminal state 15 is reached 5 times.
2023-05-27 17:31:12.548233 From root 10, the Terminal state 16 is reached 6 times.
2023-05-27 17:31:12.947485 From root 10, the Terminal state 19 is reached 62 times.
2023-05-27 17:31:13.357869 From root 10, the Terminal state 23 is reached 5 times.
2023-05-27 17:31:13.770306 From root 10, the Terminal state 24 is reached 13 times.
2023-05-27 17:31:14.144887 From root 10, the Terminal state 25 is reached 108 times.
2023-05-27 17:31:14.508558 From root 10, the Terminal state 26 is reached 185 times.
2023-05-27 17:31:14.845109 From root 10, the Terminal state 29 is reached 625 times.
2023-05-27 17:31:15.240867 From root 10, the Terminal state 31 is reached 99 times.
2023-05-27 17:31:15.272587 Terminal clusters corresponding to unique lineages are {3: 'Granule mature', 5: 'Astrocytes', 8: 'OPC', 12: 'Mossy', 13: 'Endothelial', 14: 'Radial Glia-like', 15: 'OL', 16: 'OPC', 19: 'Cck-Tox', 23: 'Endothelial', 24: 'Granule mature', 25: 'Granule mature', 26: 'Granule mature', 29: 'Granule mature', 31: 'Granule mature'}
2023-05-27 17:31:15.272619 Begin projection of pseudotime and lineage likelihood
2023-05-27 17:31:15.807564 Graph has 1 connected components before pruning
2023-05-27 17:31:15.809487 Graph has 18 connected components after pruning
2023-05-27 17:31:15.819261 Graph has 1 connected components after reconnecting
2023-05-27 17:31:15.819656 54.9% links trimmed from local pruning relative to start
2023-05-27 17:31:15.819674 54.9% links trimmed from global pruning relative to start
2023-05-27 17:31:15.821453 Start making edgebundle milestone...
2023-05-27 17:31:15.821477 Start finding milestones
2023-05-27 17:31:16.367882 End milestones
2023-05-27 17:31:16.368052 Will use via-pseudotime for edges, otherwise consider providing a list of numeric labels (single cell level) or via_object
2023-05-27 17:31:16.372785 Recompute weights
2023-05-27 17:31:16.391992 pruning milestone graph based on recomputed weights
2023-05-27 17:31:16.392878 Graph has 1 connected components before pruning
2023-05-27 17:31:16.393428 Graph has 8 connected components after pruning
2023-05-27 17:31:16.398934 Graph has 1 connected components after reconnecting
2023-05-27 17:31:16.399609 67.8% links trimmed from global pruning relative to start
2023-05-27 17:31:16.399630 regenerate igraph on pruned edges
2023-05-27 17:31:16.405708 Setting numeric label as single cell pseudotime for coloring edges
2023-05-27 17:31:16.415589 Making smooth edges
2023-05-27 17:31:17.174809 Time elapsed 9.4 seconds
import matplotlib.pyplot as plt
fig,ax=v0.plot_stream(basis='X_mde',clusters='clusters',
density_grid=0.8, scatter_size=30, scatter_alpha=0.3, linewidth=0.5)
plt.title('Raw Dentategyrus',fontsize=12)
#fig.savefig('figures/v0_via_fig4.png',dpi=300,bbox_inches = 'tight')
fig,ax=v1.plot_stream(basis='X_mde',clusters='clusters',
density_grid=0.8, scatter_size=30, scatter_alpha=0.3, linewidth=0.5)
plt.title('Interpolation Dentategyrus',fontsize=12)
#fig.savefig('figures/v1_via_fig4.png',dpi=300,bbox_inches = 'tight')
fig,ax=v0.plot_stream(basis='X_mde',density_grid=0.8, scatter_size=30, color_scheme='time', linewidth=0.5,
min_mass = 1, cutoff_perc = 5, scatter_alpha=0.3, marker_edgewidth=0.1,
density_stream = 2, smooth_transition=1, smooth_grid=0.5)
plt.title('Raw Dentategyrus\nPseudoTime',fontsize=12)
fig,ax=v1.plot_stream(basis='X_mde',density_grid=0.8, scatter_size=30, color_scheme='time', linewidth=0.5,
min_mass = 1, cutoff_perc = 5, scatter_alpha=0.3, marker_edgewidth=0.1,
density_stream = 2, smooth_transition=1, smooth_grid=0.5)
plt.title('Interpolation Dentategyrus\nPseudoTime',fontsize=12)